Parmesan: Meteor without Paraphrases with Paraphrased References

Author

  • Petra Barancikova
Abstract

This paper describes Parmesan, our submission to the metrics task of the 2014 Workshop on Statistical Machine Translation (WMT) for evaluating English-to-Czech translation. We show that the Czech Meteor paraphrase tables are so noisy that they can actually harm the performance of the metric. After extensive filtering, however, they can be very useful for targeted paraphrasing of Czech reference sentences prior to evaluation. Parmesan first performs targeted paraphrasing of the reference sentences and then computes the Meteor score using only exact matches against these new reference sentences. It shows significantly higher correlation with human judgment than Meteor on the WMT12 and WMT13 data.
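The two-step idea in the abstract (paraphrase the reference toward the hypothesis, then score with exact matching only) can be illustrated with a minimal sketch. This is not the Parmesan implementation: the function names, the toy English paraphrase table, and the choice of alpha = 0.5 are all assumptions made for illustration, and the score is only a simplified exact-match unigram F-mean that omits Meteor's fragmentation penalty and tuned parameters.

```python
from collections import Counter


def targeted_paraphrase(reference, hypothesis, paraphrases):
    """Replace reference words that are missing from the hypothesis with a
    paraphrase that does occur in the hypothesis (hypothetical helper)."""
    hyp_words = set(hypothesis)
    new_reference = []
    for word in reference:
        if word not in hyp_words:
            # Keep only paraphrases that the hypothesis actually contains.
            candidates = [p for p in paraphrases.get(word, ()) if p in hyp_words]
            if candidates:
                word = candidates[0]
        new_reference.append(word)
    return new_reference


def exact_match_fmean(reference, hypothesis, alpha=0.5):
    """Simplified exact-match score: weighted harmonic mean of unigram
    precision and recall. alpha=0.5 is an assumed value, and the
    fragmentation penalty used by Meteor is deliberately left out."""
    if not reference or not hypothesis:
        return 0.0
    # Clipped count of tokens shared by both sentences.
    matches = sum((Counter(reference) & Counter(hypothesis)).values())
    if matches == 0:
        return 0.0
    precision = matches / len(hypothesis)
    recall = matches / len(reference)
    return precision * recall / (alpha * precision + (1 - alpha) * recall)


# Toy data (assumed, English for readability).
reference = "the cat sat on the couch".split()
hypothesis = "the cat sat on the sofa".split()
paraphrases = {"couch": ["sofa"]}

new_reference = targeted_paraphrase(reference, hypothesis, paraphrases)
score_before = exact_match_fmean(reference, hypothesis)
score_after = exact_match_fmean(new_reference, hypothesis)
print(new_reference, score_before, score_after)
```

In this toy case the paraphrased reference matches the hypothesis word for word, so the exact-match score rises. With a noisy, unfiltered paraphrase table the same substitution step could introduce wrong words, which is why the abstract stresses extensive filtering.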

Similar resources

Machine Translation within One Language as a Paraphrasing Technique Online

We present a method for improving machine translation (MT) evaluation by targeted paraphrasing of reference sentences. For this purpose, we employ MT systems themselves and adapt them for translating within a single language. We describe this attempt on two types of MT systems – phrase-based and rule-based. Initially, we experiment with the freely available SMT system Moses. We create translati...

HIT2016@DPIL-FIRE2016: Detecting Paraphrases in Indian Languages based on Gradient Tree Boosting

Detecting paraphrases is an important and challenging task. It can be used in paraphrase generation and extraction, machine translation, question answering, and plagiarism detection. Because the same meaning can be expressed in another sentence using different words, traditional methods based on lexical similarity are ineffective. In this paper, we describe a strategy of Detect...

Multilingual WSD-like Constraints for Paraphrase Extraction

The use of pivot languages and word-alignment techniques over bilingual corpora has proved an effective approach for extracting paraphrases of words and short phrases. However, inherent ambiguities in the pivot language(s) can lead to inadequate paraphrases. We propose a novel approach that is able to extract paraphrases by pivoting through multiple languages while discriminating word senses in ...

FUN-NRC: Paraphrase-augmented Phrase-based SMT Systems for NTCIR-10 PatentMT

This paper describes FUN-NRC group’s machine translation systems that participated in the NTCIR-10 PatentMT task. The central motivation of this participation was to clarify the potential of automatically compiled collections of sub-sentential paraphrases. Our systems were built using our baseline phrase-based SMT system by augmenting its phrase table with novel translation pairs generated by c...

Meteor Universal: Language Specific Translation Evaluation for Any Target Language

Parameter set learned using all WMT12 data (Callison-Burch et al., 2012):

  • 100,000 binary rankings covering 8 language directions.
  • Restrict scoring for all languages to exact and paraphrase matching.

Parameters encode human preferences that generalize across languages:

  • Prefer recall over precision.
  • Prefer word choice over word order.
  • Prefer correct translations of content words over functi...

Publication date: 2014